Comparing Regression Equations Across Training Programs:
An Empirical Study of Prior Selection Effects and
Alternative Prediction Composites


Introduction 


In  the  context  of  criterion-related  validity  studies,  the  importance 
of  comparisons  of  regression  equations  for  groups  differentiated  by 
characteristics  irrelevant  to  criterion  performance  has  been  underscored 
recently  in  the  new  Standards  for  Educational  and  Psychological  Testing 
(AERA,  APA  and  NCME,  1985).  When  feasible,  it  is  recommended  that  studies 
of  differences  between  prediction  systems  include  comparisons  of  predicted 
criterion  scores  at  various  points  on  the  regression  function  for  groups 
of  substantive  interest,  in  addition  to  the  more  common  comparisons  of 
validity  coefficients. 

It  has  been  clear  for  some  time  that  any  differences  observed  in  such 
comparisons  can  be  caused  by  a  number  of  factors,  not  all  of  which  make 
the  desired  assessments  of  differential  prediction  or  predictive  bias 
transparent.  Linn  and  Werts  (1971),  for  example,  showed  that  differences 
between  subgroup  regression  equations  can  be  caused  by  failure  to  include 
a  relevant  predictor  in  the  equations  being  compared,  i.e.  by  incorrect 
specification  of  the  prediction  model  being  determined.  This  is  the  case 
when a variable that is related to performance on the criterion is correlated
with subgroup membership and is omitted from the regression
equation. Indeed, several authors (e.g., Hunter & Hunter, 1984; Gamache &
Novick, 1985) have suggested that the problem posed by differences
between subgroup predictions is best handled by respecifying the
prediction model, either adding or deleting independent
variables so that subgroup differences are reduced. Gamache and
Novick (1985) further argue that differences between prediction systems are often
produced by the inclusion of variables having weak relationships with the
criterion and strong ones with group membership.

In  studies  of  predictive  bias  alluded  to  above,  another  potential 
cause  of  contrasting  regression  equations  is  the  presence  of  incidental 
selection  effects.  When  a  variable,  or  set  of  variables,  is  correlated 
with  both  predictors  and  the  criterion,  and  the  distributions  of  such 
variables  in  the  subgroups  are  not  similar  because  the  degrees  of  range 
restriction  vary,  then  differences  are  likely  to  surface  in  the  comparison 
of regression equations. Linn (1983) illustrates this phenomenon specifically
in the context of predictive bias, although the problem has been
familiar to personnel psychologists for many years. It too can be understood
as a slightly different reflection of the specification error
discussed previously. Here the error lies in failure to consider the
selection process as contributing to possible differences between prediction
systems.

In  addition  to  complicating  the  issue  of  predictive  bias,  the  effects 
of incidental selection can wreak havoc on efforts to improve the selection
process through assessments of the accuracy of alternate predictors
of criterion performance. As illustrated by Dunbar and Linn (1985), a
variable that was not used for selecting persons from the applicant pool
may appear to be a better predictor of performance than the actual selection
variable simply because its range is less restricted than the range
of  the  selection  variable.  For  this  reason,  judgments  of  the  quality  of 
alternate  predictor  variables  on  the  basis  of  selected  samples  must  often 
be  tempered  by  careful  consideration  of  the  sample  selection  process. 




Research on personnel selection in the military is typically conducted
in the face of problems such as those described above. The purpose
of this paper is to compare the regressions of performance criteria on
selection composites from the Armed Services Vocational Aptitude Battery
(ASVAB), Forms 8, 9 and 10, for a host of technical training programs in
the  Marine  Corps.  In  addition  to  the  usual  comparisons  of  unadjusted 
estimates  of  slopes  and  intercepts,  comparisons  will  be  made  of  estimates 
that  have  been  corrected  for  the  possible  effects  of  incidental  selection 
by  two  adjustment  procedures.  The  intent  of  examining  both  unadjusted  and 
adjusted  estimates  is  twofold.  First,  it  will  provide  an  indication  of 
the  magnitude  of  differences  that  could  be  considered  due  to  different 
degrees  of  range  restriction  on  the  predictor  variables  of  interest. 
Second,  it  will  provide  an  alternative  view  of  the  accuracy  of  various 
ASVAB  selection  composites  for  heterogeneous  training  programs  that  is 
less influenced by the fact that selected samples were used in the
calculations.


Related  Research 

Comparisons of the criterion-related validity of selection instruments
across job categories have been enriched by developments in the meta-analysis
of locally-based validity studies known to personnel
psychologists under the generic heading of validity generalization. No
attempt  is  made  here  to  review  the  vast  amount  of  work  done  in  this  area 
during  the  past  five  years  (cf.  Linn  &  Dunbar,  1985).  In  general,  this 
work  focuses  on  identifying  what  are  considered  artifactual  sources  of 
variability in the observed predictive validities of selection tests used
for screening applicants for jobs or admission to educational programs.

Of  the  artifactual  sources  of  variation  in  observed  validity  coefficients 
that  are  usually  addressed  in  validity  generalization  research,  the  one 
most  relevant  to  the  concerns  of  the  present  study  is  that  due  to  varying 
degrees  of  selectivity  in  the  technical  training  program  for  which 
validity  evidence  is  sought.  This  study  differs  from  most  empirical  work 
in the validity generalization tradition in its focus on regression equations
in addition to validity coefficients, so it might be more
appropriate  to  term  the  present  study  one  of  relationship  generalization 
and,  moreover,  one  that  considers  only  one  of  the  several  sources  of 
situational  specificity  in  regression  equations  mentioned  in  the  original 
developments  of  Schmidt  and  Hunter  (1977). 

In  most  validity  generalization  research,  the  standard  approach  to 
range  restriction  involves  the  use  of  Pearson's  correction  for  explicit 
selection  on  the  predictor  described  by  Thorndike  (1949).  This  results  in 
an  adjusted  predictor-criterion  correlation  that  is  higher  than  the 
original  value  as  a  function  of  the  ratio  of  the  standard  deviations  in 
the  unselected  population  and  selected  sample.  When  the  selection  process 
is  known  to  be  based  on  other  variables  in  addition  to  the  predictor  of 
immediate  concern,  and  such  variables  are  positively  correlated  with  the 
predictor  and  criterion,  this  adjustment  is  likely  to  be  conservative, 
i.e.  underestimate  the  population  correlation  (Linn,  1968;  Linn,  Harnisch 
&  Dunbar,  1981a).  This  has  led  some  researchers  to  suggest  the  use  of 
Lawley's multivariate adjustment procedures, which accommodate multiple
predictors and criteria in addition to providing corrections for the
regression slopes and intercepts that are also affected by selection on a
third variable.
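The univariate correction can be illustrated with a short computation. The sketch below assumes the usual textbook form of the Pearson correction for explicit selection (Thorndike's Case 2), r_c = r u / sqrt(1 - r^2 + r^2 u^2), where u is the ratio of the unselected to the selected standard deviation of the predictor. The function name and the numerical values are hypothetical and are not taken from the present data.

```python
import math

def correct_explicit_selection(r_selected, sd_unselected, sd_selected):
    """Correct a correlation for explicit selection on the predictor.

    Assumes linearity and homoscedasticity of the regression of the
    criterion on the predictor (the assumptions discussed in the text).
    """
    u = sd_unselected / sd_selected          # ratio of standard deviations
    r2 = r_selected ** 2
    return (r_selected * u) / math.sqrt(1.0 - r2 + r2 * u ** 2)

# Hypothetical values: observed validity of .30 in a selected sample whose
# predictor standard deviation is 7, against 10 in the unselected group.
r_adjusted = correct_explicit_selection(0.30, 10.0, 7.0)
print(round(r_adjusted, 3))
```

When the two standard deviations are equal (u = 1) the function returns the observed correlation unchanged, and the adjusted value grows as u increases.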


An issue of great concern regarding multivariate correction procedures
is their accuracy under conditions in which the assumptions of
linearity and homoscedasticity of the regression of incidental on explicit
selection variables are violated, particularly when selection is severe.
Lord and Novick (1968) were among the first to caution against heavy
reliance on adjustment procedures because of potential overcorrections due
to reduced variance around the regression line at extreme predictor
scores. This was considered an acute problem whenever the ratio of selection
variable standard deviations in the unselected population and
selected sample exceeded 1.4. Subsequent empirical studies have documented
such overcorrections in the presence of heteroscedasticity and
extreme degrees of range restriction (Novick & Thayer, 1969; Greener &
Osburn, 1979; Dunbar, 1983). However, simulation studies have also shown
that the tendency toward overcorrection can be overshadowed by a complementary
tendency toward undercorrection when the slope of the
regression line decreases at extreme scores on the predictor. When the
effects of these types of non-linearity and heteroscedasticity are considered
simultaneously, the result is often a conservative estimate of the
population correlation (Dunbar, 1983). The same trend has been found in
corrections of regression slopes and intercepts. The question of violated
assumptions is considered at length by Dunbar and Linn (1985), who summarize
the literature on this matter and argue that multivariate
adjustments are likely to be conservative in the context of validating
selection tests in the military.

Further  support  for  the  use  of  adjustment  procedures  comes  from 
comparing their performance to that of methods for handling range restriction
that stem from different assumptions about the regressions of
substantively interesting variables on selection variables. Linn,
Harnisch  and  Dunbar  (1981b)  and  Linn  and  Hastings  (1984)  illustrate  the 
use  of  an  adjustment  procedure  that  is  based  on  empirical  relationships 
between  observed  validity  coefficients  and  characteristics  of  predictor 
score  distributions  that  reflect  the  presence  of  range  restriction.  The 
latter  study,  in  particular,  found  point  estimates  of  the  predictive 
validity  of  the  Law  School  Admission  Test  (LSAT)  obtained  by  the 
empirically-based  procedure  to  be  quite  similar  to  those  obtained  by  the 
multivariate adjustment procedure. In a similar vein, Braun and
Szatrowski (1984) used an elaborate method for rescaling criterion variables
used in locally-based validity studies that links or equates
criterion  scores  in  similar  groups  to  create  what  they  call  a  universal 
criterion  scale.  Criterion  scores  on  the  universal  scale  were  then  used 
to  validate  the  LSAT  and  undergraduate  grades  as  predictors  of  law  school 
performance.  Again,  the  results  showed  the  universal  scale  approach  to 
give  estimates  of  predictive  validity  that  were  similar  to  those  provided 
by the Pearson-Lawley adjustments, even though the two methods are based
on  quite  different  assumptions. 

Even  though  a  negative  bias  may  remain  when  using  the  multivariate 
corrections, it does not follow that a population value cannot be overestimated
with an adjustment procedure. It is well known that the mean
squared  errors  of  adjusted  slopes,  intercepts,  and  correlations  can  be 
much  larger  than  those  of  unadjusted  values  (Dunbar  &  Linn,  1985). 
Appropriate caution, therefore, needs to be emphasized in the interpretation
of any coefficient that has been 'corrected' for range restriction.


Method


The  data  used  in  this  investigation  consist  of  ASVAB  subtest  and 
composite  scores  and  final  course  grades  for  Marine  Corps  trainees  in  27 
different technical training programs leading to specific job classifications
upon completion of training. Table 1 lists the training specialties
included  along  with  sample  sizes  and  relevant  ASVAB  selection  composites. 
Training  programs  are  categorized  on  the  basis  of  the  ASVAB  composite  used 
for  selection,  forming  training  cohorts  such  as  mechanical  or  clerical 
specialties. All training programs included in this study report performance
measures on a nominal scale of 0 to 100. Sample sizes ranged from
109  to  1791,  with  a  mean  of  336.  In  order  to  restrict  possible  sources  of 
differences  between  regression  equations  for  individual  programs,  only 
white  males  were  included  in  the  samples  for  which  analyses  are  reported. 
Studies  showing  the  importance  of  gender-  and  race-related  differences  for 
military  technical  training  data  have  recently  been  conducted  by  Dunbar 
and  Novick  (1985),  Curtis,  Foley  and  Monzon  (1985),  and  Houston  and  Novick 
(1985).  This  problem  receives  further  attention  in  discussion  of  the 
results  of  the  present  study. 


Insert  Table  1  About  Here 


Procedure 


All analyses were conducted using final course grades, standardized
within each training program, as criteria and selected ASVAB composites as
predictors. Least-squares and Bayesian m-group estimates of slopes and
intercepts from the regression of standardized course grades on the ASVAB
composites were first determined with no adjustments made for the possible
effects of incidental selection. These were then compared to estimates
derived from two procedures for correcting regression coefficients for
incidental selection: (1) the standard Lawley multivariate correction
formulas, using statistics from three different reference populations as
estimates of the variances and covariances of ASVAB subtests in an unselected
group, and (2) a modification of the Heckman (1979) approach to
incidental selection, using a logistic instead of a probit regression in
the first stage of his two-stage procedure. In the case of the Heckman
adjustments, data from the entire sample were used in the logistic regressions
in order to estimate selection terms for trainees in individual
programs.

The  modification  of  Heckman's  approach  to  sample  selection  bias  was 
straightforward.  In  the  original  Heckman  (1979)  two-stage  procedure,  a 
selection  term  is  estimated  for  each  observation  that  describes  the  chance 
of  the  observation's  being  lost  in  the  sample  selection  process.  This 
term  is  estimated  using  a  probit  regression  of  the  dichotomous  indicator 
of  presence  in  the  sample  (1  =  selected,  0  =  not  selected)  on  the  set  of 
posited  selection  variables.  Heckman  (1979)  specifically  used  a  hazard 
rate,  the  ratio  of  the  ordinate  of  the  normal  density  to  the  probability 
of  non-selection,  as  the  term  entered  along  with  predictors  of  substantive 
interest  in  a  least-squares  regression.  The  least-squares  regression  is 
adjusted  for  selection  bias  by  inclusion  of  a  selection  term  as  the 
formerly 'missing' variable. The modification used in this study simply
substituted a logistic regression for the probit regression in the first
stage of Heckman's method and specified the selection term using the
relation


log{[1 - e(x)]/e(x)} = a + b'X,


where a and b represent the parameters of the logistic regression and X
represents the vector of posited selection variables. This term simply
represents the log-odds of non-selection and has been used by Rosenbaum
and Rubin (1983a, 1983b) for bias adjustment in observational studies,
with e(x) = Prob(Y=1|X), the probability of selection given X, termed the
propensity score. The relevance of
their work to selection bias in criterion-related validity studies is
obvious, but seems not to have been described in any explicit manner in
the literature.
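The two stages of the modified procedure can be sketched on simulated data. The example below is illustrative only. The subtest scores, the selection rule, the composite, and the grades are all hypothetical, and the logistic regression is fitted by a simple Newton-Raphson routine written for the purpose rather than by the software used in the study. The logistic model is parameterized here in the conventional way, as the log-odds of selection, so the log-odds of non-selection is the negative of the fitted linear predictor.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit of a logistic regression of y on X.

    Returns [intercept, slopes...] for the log-odds of y = 1.
    """
    Z = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ beta)))
        grad = Z.T @ (y - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def propensity(X, beta):
    """e(x) = Prob(Y = 1 | X) under the fitted logistic model."""
    Z = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(Z @ beta)))

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))                        # hypothetical subtest scores
true_logit = 0.5 + X @ np.array([1.0, 0.5, 0.0])   # hypothetical selection rule
in_program = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)
grade = 0.4 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)

# Stage 1: logistic regression of the selection indicator on the subtests,
# using the entire sample, then the log-odds of non-selection.
beta = fit_logistic(X, in_program)
e = propensity(X, beta)
selection_term = np.log((1.0 - e) / e)

# Stage 2: least-squares regression, in the selected group only, of the
# criterion on a (hypothetical) composite plus the selection term.
sel = in_program == 1
composite = X[sel, 0] + X[sel, 1]
design = np.column_stack([np.ones(sel.sum()), composite, selection_term[sel]])
coef, *_ = np.linalg.lstsq(design, grade[sel], rcond=None)
print(coef)   # intercept, composite slope, selection-term coefficient
```

The second element of `coef` is the slope for the composite after the selection term has been entered as the formerly missing variable.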

In the present study, all ASVAB subtests (which in varying combinations
make up the composite measures) were treated as explicit selection
variables in both the Lawley and Heckman adjustment procedures. Thus, the
covariance matrix of subtests in each of the designated reference populations
was obtained in implementing Lawley's adjustment. In the
modification of Heckman's approach, the subtests were considered predictors
of the dichotomous criterion in the logistic regression stage. No
attempt was made to determine empirically an optimal combination of the
subtests in fitting the logistic regressions; optimality was sacrificed
in the interest of using a common procedure for all training cohorts.
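A compact sketch of the Lawley adjustment for a single criterion may help fix ideas. It assumes the usual statement of the Pearson-Lawley formulas, in which the regression of the criterion on the explicit selection variables, and the residual variance about that regression, are the same in the selected and unselected groups. All matrices, weights, and function names below are hypothetical. They are constructed so that the assumptions hold exactly and are not estimates from the ASVAB data.

```python
import numpy as np

def lawley_correct(V_xx, V_xy, V_yy, S_xx):
    """Pearson-Lawley correction for one incidentally selected criterion y.

    V_xx, V_xy, V_yy: covariances in the selected sample (x = explicit
    selection variables). S_xx: covariance of x in the reference population.
    Returns the corrected cov(x, y) and var(y).
    """
    B = np.linalg.solve(V_xx, V_xy)          # regression weights of y on x
    S_xy = S_xx @ B
    S_yy = V_yy + B @ (S_xx - V_xx) @ B
    return S_xy, S_yy

def composite_slope(w, C_xx, C_xy):
    """Slope of y on the composite w'x implied by the given covariances."""
    return (w @ C_xy) / (w @ C_xx @ w)

# Hypothetical two-subtest example built so the assumptions hold exactly.
B_true = np.array([0.3, 0.2])
S_xx = np.array([[1.0, 0.5], [0.5, 1.0]])    # reference population
V_xx = np.array([[0.4, 0.1], [0.1, 0.6]])    # selected sample
V_xy = V_xx @ B_true
V_yy = B_true @ V_xx @ B_true + 0.5          # residual variance of 0.5

S_xy, S_yy = lawley_correct(V_xx, V_xy, V_yy, S_xx)
w = np.array([1.0, 1.0])                     # unit-weighted composite
print(composite_slope(w, V_xx, V_xy), composite_slope(w, S_xx, S_xy))
```

The corrected slope for a composite follows directly from the corrected covariances, which is why the choice of reference population for the subtest covariance matrix matters for the results.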

Because a major interest was the effect of incidental selection on
predicted criterion scores (for different predictors within the same
program and for common predictors across different programs), there was an
interest in evaluating the structure of predicted scores based on slopes
obtained under various adjustment procedures and in determining differences
in this structure for the various training programs. This was
accomplished by means of a three-way or weighted metric multidimensional
scaling (Carroll & Chang, 1970) of average absolute differences between
predicted scores obtained from the six ASVAB selection composites using
unadjusted and adjusted regression equations.
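The input to the scaling for a single training program can be sketched as follows. The example assumes that the average is taken over trainees within the program, which the text does not state explicitly. The composite scores, slopes, and intercepts are hypothetical. Only the dissimilarity matrix is computed; the INDSCAL fit itself is not shown.

```python
import numpy as np

def prediction_dissimilarities(scores, intercepts, slopes):
    """Average absolute differences between predicted criterion scores.

    scores: (n_trainees, k) composite scores; intercepts, slopes: length-k
    regression coefficients, one equation per composite. Returns a k x k
    symmetric matrix with a zero diagonal.
    """
    preds = intercepts + slopes * scores                  # (n, k) predictions
    diffs = np.abs(preds[:, :, None] - preds[:, None, :]) # (n, k, k)
    return diffs.mean(axis=0)

rng = np.random.default_rng(2)
scores = rng.normal(loc=100.0, scale=15.0, size=(200, 6))      # hypothetical
slopes = np.array([0.020, 0.025, 0.030, 0.015, 0.028, 0.022])  # hypothetical
intercepts = -100.0 * slopes                                   # hypothetical
D = prediction_dissimilarities(scores, intercepts, slopes)
print(D.round(3))
```

One such matrix per training program, under a given adjustment condition, would form the three-way array submitted to the weighted scaling.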

Results 

The principal results of this study concern the estimates of regression
slopes and intercepts under various conditions. The Lawley
adjustments  of  least-squares  and  m-group  coefficients  were  based  on  one  of 
three  potential  reference  groups:  (1)  the  data  base  of  Marine  Corps 
trainees  available  for  the  present  study  (a  surrogate  accession 
population),  (2)  the  1980  Youth  population  with  the  lower  10  percent  of 
the  AFQT  distribution  deleted  and  (3)  the  full  1980  Youth  population.  The 
modified Heckman results, in contrast, base any adjustment only on individuals
in the present data base. Because complete results of all
regression  analyses  are  unwieldy,  only  highlights  will  be  discussed  in  the 
body  of  the  present  report. 

Trends found to be typical of many of the findings are illustrated in
Figures 1 and 2, which contain box-and-whisker plots of the distributions
of regression slopes in the least-squares analysis for the 27 training
programs. In these plots, the box represents the middle 30 percent of the
distribution, while the whiskers extend to the 5th and 95th percentiles.
The plots in Figure 1 describe results of the unadjusted least-squares
analyses, while those in Figure 2 describe results of the unadjusted m-group
analyses.

The  distributions  of  unadjusted  slopes  given  in  Figure  1  depict  a 
scene  that  is  common  in  criterion-related  validity  studies  that  compare 
many  groups  or  job  classifications.  The  general  appearance  is  one  of 
selection  tests  that  lead  to  a  heterogeneous  set  of  predicted  criterion 
scores  depending  on  combinations  of  job  classifications  and  predictor 
variables.  Variance  explained  in  the  criterion  by  the  various  ASVAB 
composites  ranged  from  less  than  1  percent  for  the  clerical  composite,  CL, 
in  a  combat  specialty  to  29  percent  for  the  electrical  composite,  EL,  in  a 
clerical  specialty.  Although  it  was  not  the  case  that  the  selection 
composite  used  in  particular  specialty  areas  explained  the  least  variance 
in  the  criterion,  for  many  of  the  groups  under  examination  a  composite 
other  than  the  one  used  for  selection  had  the  appearance  of  yielding  more 
accurate predictions, in terms of variance accounted for, when the unadjusted
least-squares equations are interpreted. Incidental selection
effects  are  one  possible  explanation  for  this  outcome. 

As  can  be  seen  from  the  figure,  there  appears  to  be  substantial 
variation  in  the  sizes  of  performance  increases  as  a  function  of  ASVAB 
composite  scores.  Most  notable  in  this  regard  are  the  clerical  and 
electrical  composites,  CL  and  EL.  CL  appears  to  be  the  least  effective 
predictor  of  training  grades  over  all  job  classifications,  although  some 
exceptions  do  exist,  while  EL  appears  to  be  more  effective  than  any  other 
single  predictor  for  the  majority  of  groups.  Note  that  EL  is  not  used  as 
a  selection  composite  for  any  of  the  groups  included  in  this  study. 


Insert Figure 1 About Here


The  results  of  Bayesian  m-group  regression  analyses  are  illustrated 
by  the  plots  in  Figure  2.  The  estimates  of  slopes  depicted  in  the  figure 
were derived in a manner similar to that described by Dunbar, Mayekawa and
Novick (1985), which treats job specialties using a common selection
composite as an exchangeable sample from a population of such specialties.
Thus, the five clerical training programs, eight combat programs, and so
on, were grouped and regression parameters for individual programs were
estimated simultaneously within each of the resulting groups. Unlike the
Dunbar et al. analyses, those in the present study were performed using
an algorithm similar to the one described by Rubin (1980) that does not
assume between-group homoscedasticity of the error variances, but instead
estimates  the  mean  and  variance  of  the  prior  distribution  for  the  error 
variance  from  the  data. 


Insert Figure 2 About Here


As  can  be  observed  from  the  plots  of  m-group  slopes,  there  was  much 
greater  homogeneity  in  the  slopes  of  regression  lines  for  the  various 
groups  using  the  Bayesian  approach.  A  similar  result  was  found  for  the 
intercepts.  Although  it  was  not  the  case  that  slopes  associated  with 
common composites across groups were identical, some degree of shrinkage
of estimates toward a common value was evident from the results of the
Bayesian  analysis.  Of  course,  this  is  to  be  expected  as  a  consequence  of 
the  principle  of  exchangeability  of  groups  using  the  same  ASVAB  composite 
for  recruit  selection.  Note  that  CL  continued  to  have  the  smallest  slope, 
on  average,  of  any  ASVAB  composite. 

The  effects  of  Lawley's  multivariate  corrections  for  slopes  and 
intercepts  affected  by  incidental  selection  are  summarized  in  Table  2, 
which  gives  observed  means  and  standard  deviations  of  the  distributions  of 
regression  slopes  across  programs  under  various  adjustment  conditions  for 
the  six  ASVAB  composites  considered  in  this  report.  Several  important 
findings  can  be  noted  with  respect  to  these  statistics.  The  first  is  the 
obvious  increase  in  average  slopes  as  the  reference  population  used  to 
effect  the  Lawley  procedure  broadens  in  the  range  of  talent  represented. 
When  all  trainees  in  the  present  data  base  are  considered  the  reference 
group, the increases are not striking in magnitude. However, the use of
the 1980 Youth Population (with or without the bottom 10 percent on AFQT)
results in a more dramatic increase, on average, in the slopes of the
regression lines associated with each ASVAB composite. In addition, the
pattern of increases is nearly identical for the adjustments of least-squares
and m-group coefficients. The second outcome of interest in the
table  concerns  the  standard  deviations  of  the  observed  and  adjusted 
coefficients.  Also  as  expected,  the  variability  of  slopes  across  training 
programs  increases  when  the  Lawley  adjustments  are  used.  However,  because 
the  m-group  coefficients  are  less  variable  to  begin  with,  the  increase 
noted  in  the  standard  deviations  for  the  adjusted  Bayesian  coefficients  is 
not drastic. In all cases, the standard deviations of the least-squares
coefficients are greater than those of the m-group coefficients when the
same reference population is used, suggesting that a useful way of exercising
control over increases in the variability of adjusted values is to
take advantage of exchangeability among groups when it can be assumed to
exist.


Insert  Table  2  About  Here 


The last result of interest in Table 2 concerns the performance of
the two-stage procedure marked as condition MHK for modified-Heckman. The
average slopes obtained with this adjustment procedure more closely
resemble the means in the unadjusted conditions, either least-squares or
m-group, than they do the means under any other condition. In addition,
the standard deviations across programs in several cases are as large as or
larger than those obtained when the Lawley correction was used with the
least-squares coefficients. In other words, the results in Table 2 indicate
that this approach, as implemented in the present study, had little
effect on the prediction equations, on average, but at the same time
yielded more widely varying equations for individual programs.

Structural  Analysis  of  Predicted  Scores 

An indication that differences between regression equations within
programs were reduced on adjustment for incidental selection was observed
in results from the three-way MDS (INDSCAL) analyses. Because the structure
of only six predicted scores was examined for each group, all INDSCAL
solutions obtained were restricted to two dimensions, with principal
interest in the extent to which the second dimension was needed to explain
either the structure of differences between predicted scores within a
training program or differences between programs in that
structure.

Table 3 presents goodness-of-fit statistics for the INDSCAL solutions
under the nine adjustment conditions. The values of STRESS are based on
Kruskal's (1964) fit statistic for multidimensional scaling, which approaches
zero as the scaling solution obtained becomes a better
representation  of  the  observed  data.  The  values  of  RSQ  represent  the 
proportion  of  variance  in  the  observed  prediction  differences  explained  by 
the  two  dimensions  of  the  INDSCAL  solutions  and  are  included  for  ease  of 
interpretation. 
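For readers unfamiliar with these indices, the sketch below computes them for a single two-dimensional configuration. It assumes Kruskal's stress formula 1, the square root of the sum of squared discrepancies between fitted distances and disparities divided by the sum of squared fitted distances, and it takes RSQ as the squared correlation between the two sets of values. The configuration and the "observed" differences are simulated, and the way the scaling program pools these quantities over training programs is not reproduced here.

```python
import numpy as np

def pairwise_distances(config):
    """Euclidean distances between all distinct pairs of rows of config."""
    i, j = np.triu_indices(len(config), k=1)
    return np.sqrt(((config[i] - config[j]) ** 2).sum(axis=1))

def stress1(distances, disparities):
    """Kruskal's stress formula 1; zero indicates a perfect fit."""
    d = np.asarray(distances, dtype=float)
    d_hat = np.asarray(disparities, dtype=float)
    return np.sqrt(((d - d_hat) ** 2).sum() / (d ** 2).sum())

def rsq(distances, disparities):
    """Squared correlation between fitted distances and disparities."""
    return np.corrcoef(distances, disparities)[0, 1] ** 2

rng = np.random.default_rng(1)
config = rng.normal(size=(6, 2))            # six composites, two dimensions
dist = pairwise_distances(config)           # 15 distinct pairs
observed = dist + rng.normal(scale=0.05, size=dist.shape)   # simulated data
print(stress1(dist, observed), rsq(dist, observed))
```

A solution that reproduces the observed differences exactly gives a STRESS of zero and an RSQ of one, which is the sense in which smaller STRESS and larger RSQ indicate better fit.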

As  might  be  expected  on  the  basis  of  results  already  presented,  the 
two-dimensional  solutions  tend  to  fit  the  observed  data  better  as  the 
degree  of  adjustment  made  by  the  Lawley  correction  increases.  With 
respect  to  both  the  least-squares  and  m-group  coefficients,  values  of 
STRESS  steadily  decrease  as  the  reference  population  used  to  adjust  the 
regressions  changes  from  the  surrogate  accession  group,  to  the  truncated 
1980  Youth  group,  to  the  entire  1980  Youth  group.  These  results  suggest 
that  the  structure  of  predicted  scores  is  more  easily  explained  by  a 
solution  with  a  small  number  of  dimensions  when  adjustments  have  been  made 
for  incidental  selection.  In  other  words,  the  ASVAB  composites  tend  to 
give  more  similar  indications  of  expected  performance  when  expectations 
recognize the possible effects of incidental selection. The value of
STRESS associated with the INDSCAL solution for the modified Heckman analysis
again reflects the erratic performance of the method as implemented in
this study.

Similarities between training programs in the structure of prediction
differences are shown in Figure 3, which contains plots of the weights
estimated for individual programs in the INDSCAL solution for two adjustment
conditions. These weights describe the salience of each dimension in
determining the distance between points that represent selection composites
in the two-dimensional solution, and in the present context they
provide a means of comparing the 27 training programs. The sum of the
squared weights for a particular program equals the proportion of variance
explained by the solution for that program and is represented geometrically by
the distance of its point in the plots from the origin. Figure 3 gives
results from the unadjusted m-group regression equations and the same
equations adjusted for incidental selection using the full 1980 Youth
Population as the base group.


Insert  Figure  3  About  Here 


The plots in Figure 3 illustrate the largest contrast observed between
an unadjusted and adjusted solution. The plot for the unadjusted
condition shows greater differences between programs in the structure of
differences between predicted scores obtained from the six selection
composites. Differential weighting of the two dimensions in the MDS
solution appears to be more of a rule than an exception for the unadjusted
regression equations. In contrast, the plot for the adjusted equations
shows a greater tendency for larger weights on the first dimension, and
only a few programs with sizeable proportions of variance accounted for by
the second dimension. Plots for other adjustment conditions were similar
in revealing smaller contrasts across training programs.


Discussion

The contrasts illustrated in this paper provide an indication of the
possible effects of incidental selection on the observed regressions of
training course grades on selection tests in common use in the military.
With a homogeneous subject pool as a base group, ASVAB composite variables
were shown to yield increasingly similar predictions of criterion performance
after adjustment for selection effects. However, even with the full
1980  Youth  Population  as  a  reference  group,  the  INDSCAL  analyses  did 
reveal  small  program-to-program  differences  in  the  structure  of  predicted 
scores,  and  these  differences  seemed  related  to  the  salience  of  the  second 
dimension  in  the  MDS  solutions.  On  inspection  it  was  found  that  four  of 
the  six  training  programs  with  the  largest  weights  on  dimension  II  were 
Aviation  specialties,  while  four  of  the  six  with  the  smallest  weights  were 
General  or  Combat  specialties.  The  former  group  differs  in  having  greater 
emphasis  placed  on  specific  subject  matter  such  as  electrical  and  mechani¬ 
cal  information. 

Following adjustment for indirect range restriction, the average
slopes for ASVAB selection composites were quite similar, with the lone
exception of CL, the Clerical/Administrative composite, which in all
analyses appeared to be the poorest predictor of success in training.
This finding may be due to the fact that two of the four subtests used to
form this composite are speeded (Numerical Operations and Coding Speed).
Speededness has often appeared as a group factor in studies of human
abilities. It is likely that this factor is not well represented in the
criterion variable used in the present study, and its absence may in part
explain the relatively poor performance of CL as a predictor.
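
The pattern described here can be checked against the mean slopes reported in Table 2. The sketch below uses only the least-squares rows with no adjustment (LS0) and with adjustment to the full 1980 Youth Population (LS3), as transcribed from that table. The two summary measures, the ratio of the lowest slope to the mean of the others and the relative spread of the others, are illustrative choices made for this sketch and are not statistics used in the study.

```python
# Mean regression slopes transcribed from Table 2 (rows LS0 and LS3 only).
mean_slopes = {
    "LS0": {"CL": 0.0155, "CO": 0.0252, "EL": 0.0323,
            "FA": 0.0296, "GT": 0.0271, "MM": 0.0283},
    "LS3": {"CL": 0.0255, "CO": 0.0337, "EL": 0.0368,
            "FA": 0.0333, "GT": 0.0323, "MM": 0.0336},
}


def summarize(slopes):
    """Compare the weakest composite with the remaining composites."""
    lowest = min(slopes, key=slopes.get)
    others = [value for name, value in slopes.items() if name != lowest]
    mean_others = sum(others) / len(others)
    return {
        "lowest": lowest,
        # Slope of the weakest composite relative to the average of the rest.
        "ratio_lowest_to_others": slopes[lowest] / mean_others,
        # Range of the remaining slopes, scaled by their mean.
        "relative_spread_others": (max(others) - min(others)) / mean_others,
    }


summary = {condition: summarize(slopes) for condition, slopes in mean_slopes.items()}
for condition, stats in summary.items():
    print(condition, stats)
```

With these values CL has the lowest mean slope in both conditions, and the relative spread among the remaining five composites falls from about 0.25 before adjustment to about 0.13 after it.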

The two-stage approach adapted from Heckman (1979) did not fare well
as a routine adjustment procedure for incidental selection effects. Such
disappointing results may well be due to the fact that a common set of
selection variables was used in the logistic regression stage and the
selection process for individual programs was thereby modeled
inaccurately. They may also be a reflection of the fact that the Heckman
approach can be subject to large amounts of sampling error, particularly
in cases where the variables used in the logistic regression stage are
closely related to those used in the second stage and where sample sizes
are not extremely large. Both of these conditions were present in this
study. In contrast, the behavior of the Lawley corrections appeared to be
much more regular and predictable.
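
One way to see why overlap between the two stages inflates sampling error is to look at how nearly the selection-correction term can be reproduced from the second-stage predictor. The sketch below is a simulation under stated assumptions and not the modified procedure used in this study. It uses the classic probit-based inverse Mills ratio as a stand-in correction term, standard normal simulated predictors, and arbitrary selection-index coefficients. The function names are hypothetical.

```python
import math

import numpy as np


def inverse_mills(z):
    """Standard normal density divided by the standard normal CDF at z."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf


def correction_term_vif(x, selection_index):
    """Variance inflation factor of the correction term given predictor x."""
    lam = np.array([inverse_mills(z) for z in selection_index])
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, lam, rcond=None)
    resid = lam - design @ coef
    r_squared = 1.0 - resid.var() / lam.var()
    return 1.0 / (1.0 - r_squared)


rng = np.random.default_rng(0)
x = rng.normal(size=2000)   # predictor used in the second-stage regression
w = rng.normal(size=2000)   # extra selection variable absent from the second stage

# Case 1: selection depends only on the second-stage predictor.
vif_same = correction_term_vif(x, 0.5 + 1.0 * x)
# Case 2: selection also depends on a variable excluded from the second stage.
vif_excl = correction_term_vif(x, 0.5 + 1.0 * x + 1.0 * w)

print(f"VIF, same variables in both stages: {vif_same:.2f}")
print(f"VIF, extra first-stage variable:    {vif_excl:.2f}")
```

When the same variable drives both stages, the correction term is close to a linear function of the predictor, so its coefficient is estimated with much less precision than when the first stage contains a variable of its own.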

Final comments concern the generalizability of the findings of the
present study. Although greater similarities between predicted scores
were found following adjustment for incidental selection, they were
obtained using data from a homogeneous subject pool. To the extent that
subject variables such as sex and race interact with ASVAB composites in
the prediction of training success (cf. Dunbar & Novick, 1985; Houston &
Novick, 1985), greater differences between regression lines using
heterogeneous groups would be likely even after adjustment for incidental
selection. The intent of the present analysis was to show how much
similarity might be expected for subjects with common backgrounds, and even
in this case some differences between groups could be detected.

A  second  concern  regarding  generalizability  relates  to  the  choice  of 
a  reference  population  on  which  to  base  adjustments  for  range  restriction. 
The choice here probably depends on the role one sees the Lawley
adjustments playing in test validation. In large organizations in both
government and industry, correcting for restriction of range serves a need
for comparability as well as a need for reduced bias in parameter
estimation. It was in the interest of comparability across services
that Dunbar and Linn (1985) suggested the use of the 1980 Youth Population
as  a  reference  group.  One  should  recognize,  however,  that  an  accession 
population  might  be  more  appropriate  in  some  settings. 
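
To make the role of the reference population concrete, the sketch below applies the multivariate range-restriction correction in the form usually attributed to Pearson and Lawley. It is a minimal illustration and not the code used in this study. It assumes that the regression of the incidentally selected variables on the explicit selection variables is linear and homoscedastic, and that the covariance matrix of the selection variables in the chosen reference population is known. The function name and all numbers are hypothetical.

```python
import numpy as np


def lawley_correct(V, W_xx, p):
    """Correct a restricted covariance matrix for selection on the first p variables.

    V    : (p+q) x (p+q) covariance matrix observed in the selected group
    W_xx : p x p covariance matrix of the selection variables in the
           reference population
    """
    V = np.asarray(V, dtype=float)
    W_xx = np.asarray(W_xx, dtype=float)
    V_xx, V_xy, V_yy = V[:p, :p], V[:p, p:], V[p:, p:]
    # Regression weights of y on x, assumed unchanged by selection on x.
    B = np.linalg.solve(V_xx, V_xy)
    W_xy = W_xx @ B
    W_yy = V_yy + B.T @ (W_xx - V_xx) @ B
    return np.block([[W_xx, W_xy], [W_xy.T, W_yy]])


# Hypothetical example: two selection tests and one training criterion.
V_restricted = np.array([[0.40, 0.10, 0.20],
                         [0.10, 0.50, 0.15],
                         [0.20, 0.15, 0.80]])
W_reference = np.array([[1.00, 0.50],
                        [0.50, 1.00]])
print(lawley_correct(V_restricted, W_reference, p=2))
```

Because W_xx enters the formulas directly, replacing the 1980 Youth Population with an accession population changes every corrected variance and covariance, which is why the choice of reference group matters for comparability.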

Table 1

Training Programs, Sample Sizes and Operational
Selection Composites


Training Program               Sample Size    Composite

Administrative Clerk                   205    Clerical (CL)
Communications Center                  180    Clerical (CL)
Supply Stock                           446    Clerical (CL)
Aviation Supply                        269    Clerical (CL)
Finance Records Clerk                  135    Clerical (CL)
Rifleman A                            1791    Combat (CO)
Rifleman B                            1013    Combat (CO)
Machine Gunner A                       391    Combat (CO)
Machine Gunner B                       141    Combat (CO)
Mortar Man A                           407    Combat (CO)
Mortar Man B                           172    Combat (CO)
Antitank Assault A                     439    Combat (CO)
Antitank Assault B                     170    Combat (CO)
Fire Control                           179    Field Artillery (FA)
Amphibian Crew                         286    Field Artillery (FA)
Ammunition Storage                     118    General Technical (GT)
Basic Food Service                     341    General Technical (GT)
Aviation Ordnance                      308    General Technical (GT)
Eng. Equipment Mechanic                136    Mechanical Maintenance (MM)
Eng. Equipment Operator                362    Mechanical Maintenance (MM)
Combat Engineer                        138    Mechanical Maintenance (MM)
Tracked Vehicle Repair                 122    Mechanical Maintenance (MM)
Basic Auto Mechanic                    343    Mechanical Maintenance (MM)
Aviation Machinist                     421    Mechanical Maintenance (MM)
Aviation Equipment Mechanic            109    Mechanical Maintenance (MM)
Basic Helicopter                       320    Mechanical Maintenance (MM)
Aviation Crash Crew                    129    Mechanical Maintenance (MM)


Table 2

Means and Standard Deviations of Regression Slopes under
Various Adjustment Conditions


Adjustment                  Selection Composite
Condition         CL               CO               EL

LS0          .0155 (.0062)    .0252 (.0065)    .0323 (.0107)
LS1          .0181 (.0065)    .0283 (.0088)    .0341 (.0119)
LS2          .0243 (.0077)    .0307 (.0089)    .0344 (.0119)
LS3          .0255 (.0076)    .0337 (.0101)    .0368 (.0122)
MG0          .0151 (.0045)    .0254 (.0043)    .0314 (.0077)
MG1          .0178 (.0054)    .0284 (.0075)    .0337 (.0095)
MG2          .0239 (.0070)    .0307 (.0083)    .0341 (.0105)
MG3          .0253 (.00*1)    .0337 (.0091)    .0365 (.0111)
MHK          .0179 (.0149)    .0240 (.0076)    .0327 (.0108)

                  FA               GT               MM

LS0          .0296 (.0089)    .0271 (.0092)    .0283 (.0098)
LS1          .0309 (.0104)    .0286 (.0094)    .0290 (.0107)
LS2          .0333 (.0106)    .0320 (.0101)    .0307 (.0106)
LS3          .0333 (.0102)    .0323 (.0101)    .0336 (.0107)
MG0          .0297 (.0064)    .0268 (.0061)    .0281 (.0072)
MG1          .0308 (.0086)    .0286 (.0071)    .0290 (.0093)
MG2          .0333 (.0094)    .0320 (.0086)    .0305 (.0096)
MG3          .0333 (.0093)    .0322 (.0089)    .0336 (.0099)
MHK          .0286 (.0099)    .0266 (.0130)    .0279 (.0101)

Note:  Standard deviations are shown in parentheses.
       LS = Least-Squares, MG = M-Group.
       0 = No Adjustment, 1 = Adjustment with Present Data Base,
       2 = Adjustment with Truncated 1980 Youth Population, and
       3 = Adjustment with Full 1980 Youth Population.
       MHK = Modified Heckman Adjustment.

Table 3

Root Mean Square Goodness-of-fit Statistics
for Nine INDSCAL Solutions


Adjustment
Condition      STRESS      RSQ

LS0             .183       .737
LS1             .178       .759
LS2             .163       .832
LS3             .134       .883
MG0             .193       .729
MG1             .168       .781
MG2             .160       .836
MG3             .133       .884
MHK             .246       .427

Note:  LS = Least-Squares, MG = M-Group.
       0 = No Adjustment, 1 = Adjustment with Present Data Base,
       2 = Adjustment with Truncated 1980 Youth Population, and
       3 = Adjustment with Full 1980 Youth Population.
       MHK = Modified Heckman Adjustment.